GBE 



Palindromic Genes in the Linear Mitochondrial Genome of the 
Nonphotosynthetic Green Alga Polytomella magna 

David Roy Smith1,†, Jimeng Hua2,3,†, John M. Archibald2, and Robert W. Lee3,*

1Department of Biology, University of Western Ontario, London, Ontario, Canada

2Department of Biochemistry and Molecular Biology, Canadian Institute for Advanced Research, Integrated Microbial Biodiversity Program,
Dalhousie University, Halifax, Nova Scotia, Canada

3Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada

*Corresponding author: E-mail: robert.lee@dal.ca.

†These authors contributed equally to this work.

Accepted: August 8, 2013 

Data deposition: Sequence data from this article have been deposited in GenBank under the accession KC733827. 

Abstract 

Organelle DNA is no stranger to palindromic repeats. But never has a mitochondrial or plastid genome been described in which every 
coding region is part of a distinct palindromic unit. While sequencing the mitochondrial DNA of the nonphotosynthetic green alga 
Polytomella magna, we uncovered precisely this type of genic arrangement. The P. magna mitochondrial genome is linear and made
up entirely of palindromes, each containing 1-7 unique coding regions. Consequently, every gene in the genome is duplicated and in
an inverted orientation relative to its partner. And when these palindromic genes are folded into putative stem-loops, their predicted 
translational start sites are often positioned in the apex of the loop. Gel electrophoresis results support the linear, 28-kb monomeric 
conformation of the P. magna mitochondrial genome. Analyses of other Polytomella taxa suggest that palindromic mitochondrial 
genes were present in the ancestor of the Polytomella lineage and lost or retained to various degrees in extant species. The possible 
origins and consequences of this bizarre genomic architecture are discussed. 

Introduction

The words mitochondrial genome often evoke the image of 
a simple, circular molecule with a few dozen neatly arranged 
genes (Boore 1999), but most mitochondrial DNAs (mtDNAs) 
are far more complex than this. Since their origin from a bac- 
terial endosymbiont about two billion years ago, mtDNAs 
have adopted almost every shape, size, and organization 
imaginable (Burger et al. 2003). From the highly reduced 
mtDNA jigsaw puzzles of various protists (Nash et al. 2008; 
Vlcek et al. 2011) to the massive multichromosomal mtDNAs
of certain flowers (Sloan et al. 2012), mitochondrial genomes 
are anything but ordinary. They have influenced our under- 
standing of genomic architectural diversity across the eukary- 
otic tree of life and helped forge leading theories of genome 
evolution (Lynch et al. 2006; Gray et al. 2010). 

Here, underscoring just how bizarre mtDNAs can be, we 
present a linear mitochondrial chromosome with palindromic
genes. This unusual genome comes from the green algal 
genus Polytomella. Closely related to popular model organ- 
isms, such as Chlamydomonas reinhardtii and Dunaliella salina 
(Smith et al. 2010), Polytomella is a group of poorly studied 
nonphotosynthetic unicells, which bear four flagella and are 
found in habitats rich in dissolved organic matter, such as 
freshwater pools of rotting vegetation (Pringsheim 1955). 

Previous work on Polytomella parva, P. piriformis, and 
P. capuana revealed atypical mtDNA features, including 
linear or linear fragmented conformations, closed-loop 
telomeres, scrambled and discontinuous rRNA-coding 
genes, and, in one case, an extreme nucleotide composition 
(Fan and Lee 2002; Fan et al. 2003; Smith and Lee 2008; 
Smith et al. 2010). In this study, genomic data from 
P. magna SAG 63-9 — a heretofore genetically unexplored 
lineage, isolated from the sap of an elm tree in Cambridge, 
England (Pringsheim 1955) — provide a new take on gene and
repeat arrangement in mitochondrial genomes, and genomes
as a whole. 

One Linear Chromosome, 10 Palindromic Repeats 

Polytomella magna, like previously described Polytomella taxa, 
has a linear mitochondrial genome with inverted-repeat telo- 
meres and ten unique genes, including fragmented and 
scrambled rRNA-coding regions (fig. 1A). The P. magna
mtDNA is about twice the size (~28 kb) of those from other
Polytomella lineages, which is the product of ten large and
distinct duplications, each of which is in a tandem and
inverted (i.e., palindromic) orientation (fig. 1A). These
palindromes are AT-rich (avg. = 65%), ~240-3,100 nt long
(avg. = 2,165 nt), and punctuated by short (2-102 nt)
stretches of nonrepeated sequence (supplementary table S1, 
Supplementary Material online). The inverted repeats that 
form each palindromic element are identical to one another 
and separated by 2-13 nt of AT-rich (75-100%), but nonconserved,
sequence (supplementary table S1, Supplementary
Material online). When ignoring the telomeres, palindromes 
cover >98% of the P. magna mtDNA. Six of the palindromes 
contain a single coding region, and four harbor 2-7 coding 
segments. Consequently, all 10 genes in the P. magna mitochondrial
genome are duplicated and in an inverted arrangement
relative to their partners (fig. 1A). The duplicates of the
tRNA- and rRNA-coding regions are identical and complete, 
suggesting that both copies are functional. For the seven 
duplicated protein-coding regions, however, only one of the 
pairs contains a stop and a start codon (supplementary table 
S2, Supplementary Material online), indicating that for every 
encoded protein there is a functional and nonfunctional 
version of the gene (fig. 1A and B). When the palindromic
protein-coding genes are folded into putative stem-loop structures,
the start codon is sometimes found in the apex of the
loop with the stop codon located in the intervening sequence
between the different palindromes (fig. 1B). Despite their palindromic
nature, the genes in the P. magna mitochondrial
genome have a similar organization to those of other
Polytomella mtDNAs (fig. 1A and B).
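
The palindrome structure described above (two inverted-repeat arms that are reverse complements of one another, separated by a short AT-rich spacer) can be illustrated with a minimal Python sketch. The function names, the arm and spacer thresholds, and any test sequence are illustrative assumptions; this is not the procedure used to annotate the P. magna mtDNA, and the brute-force search is only suitable for short toy sequences.

```python
# Illustrative sketch only: names and thresholds are assumptions,
# not the annotation method used for the P. magna mtDNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def at_content(seq):
    """Return the fraction of A and T bases in seq (0.0 if seq is empty)."""
    return (seq.count("A") + seq.count("T")) / len(seq) if seq else 0.0


def find_palindrome(seq, min_arm=6, max_spacer=13):
    """Find the palindrome with the longest arms in seq.

    A palindrome here is: left arm + spacer + reverse complement of left arm.
    Returns (start, arm_length, spacer_length), or None if no arm reaches
    min_arm. Every arm boundary and spacer length is tried (brute force).
    """
    best = None
    for left_end in range(1, len(seq)):  # index just past the left arm
        for spacer in range(max_spacer + 1):
            arm = 0
            # Extend both arms outward while the bases can pair.
            while (left_end - 1 - arm >= 0
                   and left_end + spacer + arm < len(seq)
                   and PAIR.get(seq[left_end - 1 - arm]) == seq[left_end + spacer + arm]):
                arm += 1
            if arm >= min_arm and (best is None or arm > best[1]):
                best = (left_end - arm, arm, spacer)
    return best
```

Applied to a toy sequence built from a 10-nt arm, a 4-nt spacer, and the reverse complement of the arm, the function returns the start of the left arm together with the arm and spacer lengths; `at_content` can then be applied to the spacer slice to check its AT-richness.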

To the best of our knowledge, P. magna is the first eukary- 
ote shown to have an organelle genome consisting almost 
entirely of palindromes. Of the three other available 
Polytomella mtDNA sequences, those of P. parva and P. pir- 
iformis are devoid of palindromes, but that of P. capuana 
contains vestiges of palindromic genes: the 5'-ends of four 
protein- and five rRNA-coding regions form short (<30nt) 
inverted repeats with the adjacent intergenic sequences 
(fig. 1C) (Smith and Lee 2008). The short palindromic elements
in the P. capuana mtDNA have a similar orientation to those of 
P. magna (fig. 1C), and some can be folded into hairpin structures
with the start codons of protein-coding genes positioned
at the top of the loops (Smith and Lee 2008). Phylogenetic 
analyses (discussed later) support the view that palindromic
genes arose early in Polytomella evolution and were subse- 
quently maintained or lost to varying degrees in extant 
lineages. 

Complex mtDNA Hybridization Pattern in Pulsed-Field 
Gels 

Gel electrophoresis analyses support the linear conformation 
and ~28-kb size of the P. magna mitochondrial genome, but
also hint at underlying structural complexity (fig. 2A and B;
supplementary fig. S1, Supplementary Material online).
Pulsed-field gel electrophoresis (PFGE) of P. magna DNA followed
by Southern blot hybridization with a P. magna
mtDNA-derived probe (cob) consistently revealed an ~28-kb
band, which co-migrated with linear markers (fig. 2A and B).
This band was observed from in-gel digestion of P. magna cells 
(fig. 2A) as well as from samples of total cellular DNA and DNA 
isolated from a mitochondria-enriched fraction (fig. 2B). In
addition, PFGE of the latter two samples resulted in a series 
of slow-migrating bands that hybridized with the mtDNA 
probe and co-migrated with linear markers of 165 to 
>300 kb (fig. 2B). These bands were not detected in the
in-gel-lysis experiment, which, unlike the total DNA and mtDNA
analyses, showed a strong mtDNA hybridization signal in the 
loading well, implying that a significant amount of mtDNA did 
not enter the gel (fig. 2A). 

Although often depicted as genome-sized molecules, 
mtDNAs can have complicated and dynamic architectures 
(Bendich 1993, 2004). For example, the organelle genomes 
of various land plants and fungi are known to exist as com- 
plex, multigenomic linear-branched structures, which, 
through recombination, can generate unit-sized chromo- 
somes (Oldenburg and Bendich 2004; Gerhold et al. 2010). 
These branched structures are frequently overlooked because 
they can get trapped in or close to the well during gel elec- 
trophoresis (Oldenburg and Bendich 2004). 

The slow-migrating mtDNA bands in the P. magna PFGE 
experiments using mitochondrial and total cellular DNA could 
correspond to partially disrupted massive linear-branched 
mtDNAs, which remain more intact and therefore well- 
bound when derived from embedded cells. However, assem- 
blies of these types of mtDNAs, because of their concatenated 
organization, typically give circular maps (Bendich 2004), not 
linear ones, like that derived for the P. magna mtDNA 
(fig. 1A). Moreover, cleavage of linear-branched organelle 
DNA with a single-cutter restriction enzyme normally produ- 
ces a genome-sized fragment as well as various degrees of 
smearing (Bendich 2004). Digestion of P. magna mtDNA with 
a single-cutter (and two-cutter) restriction endonuclease, 
followed by PFGE and Southern blot analyses, gave sub- 
genome-sized bands, which migrated in accordance with 
the mitochondrial restriction map (supplementary fig. S1, 
Supplementary Material online); no genome-sized or high- 
molecular-weight bands or smears were observed. 
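
The reasoning behind the single-cutter test can be illustrated with a minimal sketch. It assumes complete digestion and treats a head-to-tail concatenated (circular-mapping) genome as a simple circle, which is a simplification of the linear-branched structures discussed above. The function name, the 28,000-bp length, and the cut positions used with it are hypothetical round numbers chosen for illustration, not values from the P. magna restriction map.

```python
def digest_fragments(length, cut_sites, circular=False):
    """Return sorted fragment sizes (bp) after a complete restriction digest.

    length    -- molecule length in bp
    cut_sites -- cut positions, each with 0 < position < length
    circular  -- True for a circular (or circular-mapping) molecule
    """
    sites = sorted(set(cut_sites))
    if not sites:
        return [length]  # uncut molecule
    if circular:
        # n cuts in a circle give n fragments; the last one spans the origin.
        frags = [b - a for a, b in zip(sites, sites[1:])]
        frags.append(length - sites[-1] + sites[0])
    else:
        # n cuts in a linear molecule give n + 1 fragments.
        bounds = [0] + sites + [length]
        frags = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(frags)
```

Under these assumptions, one cut site in a linear 28-kb monomer yields two sub-genome-sized fragments, whereas the same site in a circular-mapping molecule yields a single genome-sized fragment, which is the contrast the digestion experiment relies on.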

Fig. 1. — Polytomella mitochondrial DNA architecture. (A) The P. magna mitochondrial genome is an ~28-kb linear molecule with inverted-repeat
telomeres (TIRs) and ten unique genes, including the small- and large-subunit rRNA genes, which are fragmented and scrambled into four and eight loci,
respectively. The P. magna mtDNA contains ten palindromic repeats (boxed in dark or light gray and labeled with black circles). There is a functional (green)
and nonfunctional copy (white and labeled with Ψ) of each protein-coding gene. (B) When the palindromic protein-coding regions are folded into putative
stem-loop structures (stem and loop sizes as well as start/stop codons are shown), the start codon is often positioned in the apex of the loop and/or the stop
codon is found in the intervening sequence between the different palindromes. (C) P. capuana has a 13-kb linear monomeric mtDNA with short palindromic
repeats that include portions of coding DNA (boxed in gray and numbered with black circles based on their organization relative to P. magna). Polytomella
parva and P. piriformis each have a linear bipartite mtDNA with chromosome sizes of ~13 and ~3 kb and no palindromic coding regions.



Alternatively, the complex pattern of slower migrating 
bands could be a consequence of secondary structures. The 
palindromes within the P. magna mtDNA have the potential 
to form giant (>1.5 kb) cruciforms, which could slow DNA
migration within gels (fig. 2C), as has been documented in a
variety of other systems (Lilley and Clegg 1993; Oussatcheva 
et al. 1999; Sinden 1994; Stellwagen and Stellwagen 2009). 
The Gibbs free energy values, which can predict DNA duplex 
stability, of the folded single-stranded cruciforms are similar to 
the corresponding unfolded, double-stranded structures (sup- 
plementary table S3, Supplementary Material online), indicat- 
ing that under the appropriate conditions stem-loops may 
form easily and stably with the P. magna mtDNA, but further 
experiments are required to test this hypothesis. If cruciform 
structures were impeding mtDNA PFGE migration, they could 
have formed during DNA sample preparation (Courey and 
Wang 1983), and may not reflect the in vivo genome 
architecture. 

A New Polytomella Lineage 

Maximum-likelihood phylogenetic analyses, using 18S rDNA 
and mitochondrial proteins, demonstrate that P. magna SAG 
63-9 forms a distinct lineage within the Polytomella genus,
sister to a clade containing all other known Polytomella species 
(fig. 3 and supplementary fig. S2, Supplementary Material 
online). This tree topology, when placed alongside available 
data on mtDNA architecture (fig. 3), supports the hypothesis 
that the ancestral Polytomella mitochondrial genome had pal- 
indromic loci, which were ultimately preserved in the 
P. magna lineage, but partially and completely lost in the P. 
capuana and P. parva/piriformis lineages, respectively. The or- 
igins of other Polytomella mtDNA features, including a linear 
bipartite structure, are highlighted in figure 3. 

In his original description of P. magna, Pringsheim (1955) 
noted that it was larger than other identified Polytomella spe- 
cies and the only one with a discernable eyespot. Our experi- 
ences of culturing P. magna SAG 63-9 were consistent with 
Pringsheim's observations. We found that SAG 63-9, when 
grown on our standard Polytomella medium (Sheeler et al. 
1968), was conspicuously larger than strains from the three 
other known Polytomella lineages (Smith et al. 2010), had a 
visible eyespot, and was pinkish in color (supplementary fig. 
S3, Supplementary Material online) — presumably because of 
the carotenoids in the eyespot (Kreimer 2009). The Culture 
Collection of Algae at the University of Göttingen (SAG) maintains,
as of July 2013, one other strain labeled P. magna:
SAG 63-4. Previous studies on the mtDNA sequence and
structure of SAG 63-4 showed that it belongs to the P. parva
lineage (Mallet and Lee 2006; Smith and Lee 2011).
Consistent with this view, we found that SAG 63-4 had no
discernable eyespot, was smaller than SAG 63-9 and indistinguishable
in size from P. parva (SAG 63-3), and gave an off-white-colored
pellet, with no visible carotenoids, similar to a
pellet of P. parva (supplementary fig. S3, Supplementary
Material online).

Genome Biol. Evol. 5(9):1661-1667. doi:10.1093/gbe/evt122 Advance Access publication August 11, 2013

[Figure 2 image. Panels A and B: pulsed-field gels and Southern blots with marker sizes of 23, 49, 97, 146, 194, 243, and 291 kb; labeled bands at ~28, ~165, ~210, ~225, >250, and >300 kb. Panel C: hypothetical mtDNA conformation, with labels for no, minimal, moderate, and complex secondary structure.]

Fig. 2. — PFGE and Southern blot analysis of P. magna DNA. (A) PFGE and Southern blot of DNA resulting from the digestion of agarose-embedded
P. magna cells (gel: 1% pulsed-field certified agarose; chiller temperature: 14°C; running buffer: 0.5x TBE; run time: 10 h 30 min; initial switch time: 0.1 s;
final switch time: 26 s; gradient: 6 V/cm; angle: 120°). (B) PFGE and Southern blot of purified mitochondrial and total cellular P. magna DNA (gel: 1% pulsed-field
certified agarose; chiller temperature: 14°C; running buffer: 0.5x TBE; run time: 12 h; initial switch time: 15 s; final switch time: 15 s; gradient: 6 V/cm;
angle: 120°). Lanes are as follows: (i) New England Biolabs low range PFG marker; (ii) agarose-embedded cells; (iii) purified mtDNA; (iv) total cellular DNA.
Asterisk denotes Southern blot of given lane using an mtDNA-derived probe (cob). (C) Hypothetical mtDNA configuration. Note — cz = compression zone.
Sizes of bands are labeled in kilobases.

Origin of Palindromic Genes 

How did P. magna acquire such a peculiar mtDNA organiza- 
tion? Palindromes can be found, to varying degrees, in most 
genomes, but rarely do they blanket entire chromosomes, as 
observed for P. magna (fig. 1A). Several elegant hypotheses
have been put forth for the emergence of large palindromic 
DNA elements; for example, one model begins with strand
annealing at a short DNA inverted repeat after a double- 
stranded break, whereas another involves cruciform extrusion 
and resolution (Tanaka and Yao 2009). Inspection of the 
P. magna mtDNA offers no obvious solution to the origin of 
its palindromes. However, the fact that all coding regions are 
duplicated suggests that the 10 palindromic units may have 
arisen by inter- and intra-recombination events following an 
mtDNA genome duplication, similar to that proposed for
the formation of the mtDNA replication-intermediate, head- 
to-head linear dimer in Paramecium aurelia (Pritchard and 
Cummings 1981). Alternatively, the genome-wide formation 
of palindromic sequences in the P. magna mtDNA may have 
been the result of repair activity to overcome DNA-replication- 
disrupting cruciform structures; these could have emerged 
following a heat shock from previously benign short inverted 
repeat sequences of canonical double-stranded DNA structure
(SantaLucia and Hicks 2004) that arose in some noncoding
regions of the P. magna mtDNA. Homologous recombination
might be quite high within the P. magna mtDNA: the inverted
repeats that make up the palindromic elements are 100%
identical (supplementary table S1, Supplementary Material
online), which is likely the result of recurrent gene conversion
between the palindromic units. High levels of gene conversion
could also explain why the palindromes persist within the
genome (Marechal and Brisson 2010).

Fig. 3. — Maximum-likelihood phylogeny inferred from 18S rDNA sequences. Phylogenetic tree of chlamydomonadalean green algae, with the chlorophycean
Scenedesmus obliquus and the nonphotosynthetic trebouxiophyte Prototheca wickerhamii used as outgroup. The same tree topology was
obtained with a maximum-likelihood phylogeny inferred from the concatenated amino acid sequences of seven mtDNA-encoded proteins (supplementary
fig. S2, Supplementary Material online). Bootstrap support values are indicated at each node: top ones and lower ones correspond to the 18S rDNA and
mitochondrial protein trees, respectively. Hypothetical position of the origin, partial loss, and complete loss of palindromic genes are marked on the tree with
a square, diamond, and circle, respectively. Note — Partial loss of palindromic genes could have occurred in the branch leading to P. capuana, P. piriformis, and
P. parva (a) or alternatively it could have occurred in the lineage leading to P. capuana (b). Scale bar represents the estimated number of nucleotide
substitutions per site.

Palindromic organelle DNA repeats have been uncovered 
in other members of the Chlamydomonadales (the order to 
which P. magna belongs), including C. reinhardtii, D. salina, 
and Volvox carteri (Maul et al. 2002; Smith and Lee 2009; 
Smith et al. 2010), and are thought to have shaped their 
mtDNA evolution (Nedelcu and Lee 1998) and play a role in 
mitochondrial gene expression (Gray and Boer 1988). The 
fact that some of the palindromes within the P. magna 
mtDNA can be folded into hairpins, with the start codons 
of protein-coding regions located in the apex of the loop 
(fig. 1B), is suggestive of a role in gene expression.
Unraveling the transcriptional architecture of P. magna, 
however, will not be a trivial task. Chlamydomonadalean
transcriptome sequencing projects typically yield almost 
complete coverage of the mtDNA, including intergenic re- 
gions and telomeres, making it difficult to predict transcrip- 
tional units and processing sites (e.g., see GenBank 
accessions ERX177535-ERX177582).

Genetic palindromes are a hot topic. They are associated 
with a diversity of molecular processes, from DNA replication 
to major chromosomal rearrangements (Tanaka et al. 2002; 
Paek et al. 2009; Lavrov et al. 2012), are implicated in various 
human diseases (Tanaka et al. 2006; Voineagu et al. 2008), 
and have been the focus of international scientific meetings 
(Smith 2008). Overall, the P. magna mtDNA provides a fresh 
view of these important repetitive elements. 



Methods and Materials 

Polytomella magna SAG 63-9, made axenic following the 
method of Mallet and Lee (2006), was grown in darkness 
at 22 °C in the Polytomella medium of Sheeler et al. (1968) 
and harvested in the late logarithmic growth phase 
(OD750 nm ~ 0.35). A mitochondria-enriched fraction of 
P. magna was generated using Procedure B of Ryan et al.
(1978), without the step employing CsCl gradient centrifugation.
DNA was extracted from a cellular and a mitochondria-enriched
fraction using the CTAB method (Reineke et al.
1998). DNA plugs for PFGE were prepared as previously
described (Tanifuji et al. 2006). DNA was transferred to 
nylon membranes (Amersham Bioscience, NJ, USA) and 
nonradioactive Southern blot hybridizations were performed 
with a digoxigenin (DIG)-labeled, polymerase chain reaction- 
amplified P. magna mtDNA probe. Hybridization signals were 
detected using the standard protocol of the DIG detection 
kit (Roche Diagnostics, IN, USA) and Fuji Super RX medical 
X-ray film. 

Library preparation and paired-end Illumina HiSeq 2000
sequencing (100-nt reads; ~500-nt inserts) were performed
by BGI Americas (MA, USA), using total DNA, isolated with 
the DNeasy Plant Mini Kit (Qiagen, MD, USA). The P. magna 
sequence data (~5 Gb) were assembled de novo with Ray
v1.2.1 (Boisvert et al. 2010), using a k-mer of 21, and separately
with PASHA (Liu et al. 2011), using a k-mer of 31. The
resulting Ray- and PASHA-generated contigs were indepen- 
dently scanned for mitochondrial sequences using BLAST and 
the P. capuana and P. parva mtDNAs as search queries. 
Contigs matching to mtDNA were identified in each data 
set. These contigs were extended using the paired-end data 
and the Map to reference program in Geneious v6.0.6 
(Biomatters Ltd., Auckland, New Zealand), giving (in both 
cases) a complete, linear-mapping mitochondrial genome 
with telomeres (GenBank accession KC733827). Maximum 
likelihood phylogenetic analyses were performed with the 
PhyML 3.0 web server (Guindon et al. 2010), and the robustness
of individual nodes on the tree was assessed using 100
bootstrap replicates.
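
The contig-screening step described above (searching assembled contigs with known Polytomella mtDNAs as BLAST queries) can be sketched as a simple filter over BLAST+ tabular output (`-outfmt 6`, default column order). This is a minimal illustration under stated assumptions: the function name, the e-value and alignment-length cutoffs, and any sequence identifiers are hypothetical, and the BLAST settings actually used in this study are not specified in the text.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def mito_contigs(tabular_text, max_evalue=1e-10, min_length=100):
    """Return the set of subject (contig) IDs with at least one strong hit.

    tabular_text -- BLAST tabular output as a string; the reference mtDNAs
                    are the queries and the assembled contigs are the subjects
    max_evalue   -- hypothetical e-value cutoff (assumption, not from the paper)
    min_length   -- hypothetical minimum alignment length in nt (assumption)
    """
    hits = set()
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) <= max_evalue and int(row["length"]) >= min_length:
            hits.add(row["sseqid"])
    return hits
```

Contigs returned by such a filter would then be candidates for extension with the paired-end reads, as described for the Ray and PASHA assemblies.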

Supplementary Material 

Supplementary figures S1-S3 and tables S1-S3 are available
at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).